This made my day (and maybe parts of the rest of the weekend). For recovery our 1st level mounted the domU-fs on the dom0 to '/tmp/recover' and did:
4979 2013-02-01 15:03:39 cd /var/www/test.org/data/galleries/
4980 2013-02-01 15:03:43 chown www-data:www-data -R /var/www/test.org/data/galleries/
4981 2013-02-01 15:04:36 ls -la
4982 2013-02-01 15:04:46 ls -la
4983 2013-02-01 15:04:54 chown www-data:www-data -R /*
4984 2013-02-01 15:07:42 chown www-data:www-data -R /var/www/test.org/data/galleries/
4985 2013-02-01 15:36:55 chown www-data:www-data -R /var/www/test.org/data/galleries/
The experienced reader may see the problem:
2131 2013-02-01 21:29:28 cd /tmp/recover
[...]
2142 2013-02-01 21:31:17 rm -r lib64/
So the dom0 was knocked out as well ... what a funny evening (and maybe night). Maybe our staff looked something like this.
# ls -lad lib64
lrwxrwxrwx 1 root root 4 Jun 28 2011 /lib64 -> /lib
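The trap is that lib64 inside the mounted guest filesystem is an absolute symlink to /lib, so with the trailing slash the recursive delete ended up in the dom0's own /lib. A minimal sketch of a guard against that, assuming a hypothetical helper name `safe_rm_r` (nothing like it appears in the history above):

```shell
#!/bin/sh
# Hypothetical helper, for illustration only: refuse to delete
# recursively through a symlink.
safe_rm_r() {
    target=${1%/}    # drop one trailing slash so -L tests the link itself
    if [ -L "$target" ]; then
        echo "refusing: $target is a symlink to $(readlink "$target")" >&2
        return 1
    fi
    rm -r -- "$target"
}
```

Called as `safe_rm_r lib64/` inside /tmp/recover, this would have printed the link target and stopped instead of touching the dom0.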
[~] # grep '^ *\(ppp\|eth\|wlan\|ath\|ra\|ipsec\|tap\|br-\)\([^:]\)\{1,\}:' /proc/net/dev | cut -f1 -d: | sed 's/ //g
> s/\-/_/g'
eth1
eth0
Let's look into it:
[~] # ethtool eth0 | grep Speed:
Speed: 1000Mb/s
[~] # ethtool eth0 | grep "Link detected:"
Link detected: yes
[~] # ethtool eth1 | grep Speed:
Speed: Unknown! (65535)
[~] # ethtool eth1 | grep "Link detected:"
Link detected: no
As you may see, the interface eth1 is up but has no link, so no speed is negotiated and muninlite fails. Thus I hacked the script and now it's working like a charm.
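The idea behind such a hack can be sketched as follows. This is not the actual patch, only an illustration under assumptions: the helper name `link_speed_mbps` and the fallback value are made up, and it expects the output of `ethtool <iface>` on stdin in the format shown above.

```shell
#!/bin/sh
# Hypothetical helper: print the link speed in Mb/s, or a fallback value
# when ethtool reports no negotiated speed ("Unknown! (65535)").
link_speed_mbps() {
    fallback=${1:-0}
    # Only a purely numeric "Speed: <n>Mb/s" line yields a value.
    speed=$(sed -n 's/^[[:space:]]*Speed:[[:space:]]*\([0-9][0-9]*\)Mb\/s.*/\1/p')
    if [ -n "$speed" ]; then
        echo "$speed"
    else
        echo "$fallback"
    fi
}
```

Usage would be something like `ethtool eth1 | link_speed_mbps 0`, so an interface without link gets a defined value instead of breaking the plugin.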
I just updated #574837 so hopefully more people can benefit in the future.
listen=NO
listen_ipv6=YES
command[ IPv4 ] = check_smtp -4 -H $HOSTADDRESS$
command[ IPv6 ] = check_smtp -6 -H $HOSTADDRESS6$
state [ CRITICAL ] = COUNT(CRITICAL) > 1
state [ WARNING ] = COUNT(WARNING) > 0 || COUNT(CRITICAL) > 0
state [ UNKNOWN ] = COUNT(UNKNOWN) > 1
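Read in plain words, the rules mean: CRITICAL only if both address families fail, WARNING as soon as one of them warns or fails, UNKNOWN only if both are unknown. A small sketch that mimics this for two child results; the function name `dualstack_state` is hypothetical, and the precedence CRITICAL over WARNING over UNKNOWN is my assumption about how check_multi resolves several matching rules:

```shell
#!/bin/sh
# Illustration only: apply the three state rules to a list of child
# states given as words (OK, WARNING, CRITICAL, UNKNOWN).
dualstack_state() {
    crit=0; warn=0; unk=0
    for s in "$@"; do
        case $s in
            CRITICAL) crit=$((crit + 1)) ;;
            WARNING)  warn=$((warn + 1)) ;;
            UNKNOWN)  unk=$((unk + 1)) ;;
        esac
    done
    # Assumed precedence: the most severe matching rule wins.
    if [ "$crit" -gt 1 ]; then
        echo CRITICAL
    elif [ "$warn" -gt 0 ] || [ "$crit" -gt 0 ]; then
        echo WARNING
    elif [ "$unk" -gt 1 ]; then
        echo UNKNOWN
    else
        echo OK
    fi
}
```

So `dualstack_state CRITICAL OK` (IPv4 down, IPv6 fine) gives WARNING, which is exactly the point of the dualstack check: one broken address family is visible without paging anyone.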
A simple SMTP service definition does the trick (don't forget address6 in the host definition):
define service {
    use generic-service ; Name of service template to use
    host_name localhost
    service_description SMTP
    check_command check_multi_icinga!check_smtp_dualstack.cmd!-r 1+2+4+8
}
Okay .. that looks nice, but only at first glance. Thinking about what we actually run as service checks, it seems we need a new cmd-file for check_multi for every unique service check, as I haven't found a way to generalize the whole thing yet.
Does anybody know a way to pass commands for check_multi via service definitions? Something like:
define service {
    [...]
    check_command check_multi_icinga!check_general_dualstack.cmd!'check_smtp -p 666'!-r 1+2+4+8
    [...]
}
aptitude install -t squeeze-backports nagios-plugin-check-multi
Now a new check command is needed. I created the following:
define command {
    command_name check_multi_icinga
    command_line /usr/lib/nagios/plugins/check_multi \
        -f /etc/check_multi/$ARG1$ $ARG2$ $ARG3$ $ARG4$ \
        -s objects_cache=/var/cache/icinga/objects.cache \
        -s status_dat=/var/cache/icinga/status.dat \
        -s HOSTADDRESS=$HOSTADDRESS$ -s HOSTADDRESS6=$HOSTADDRESS6$
}
For monitoring connectivity I created check_host_alive_dualstack.cmd
command[ IPv4 ] = check_ping -4 -H $HOSTADDRESS$ -w 5000.0,100% -c 5000.0,100% -p 5
command[ IPv6 ] = check_ping -6 -H $HOSTADDRESS6$ -w 5000.0,100% -c 5000.0,100% -p 5
state [ CRITICAL ] = COUNT(CRITICAL) > 1
state [ WARNING ] = COUNT(WARNING) > 0 || COUNT(CRITICAL) > 0
state [ UNKNOWN ] = COUNT(UNKNOWN) > 1
Now we just need to replace the check-host-alive command and add a value for address6 on the host, as follows:
define host {
    use generic-host ; Name of host template to use
    host_name localhost
    alias localhost
    address 127.0.0.1
    address6 ::1
    check_command check_multi_icinga!check_host_alive_dualstack.cmd!-r 1+2+4+8
}
Reloading Icinga should show you something like this:
Now we have general connectivity of our dualstacked systems monitored.
Oct 7 07:45:56 post cyrus/lmtpunix[307]: DBERROR db4: Logging region out of memory; you may need to increase its size
Oct 7 07:45:56 post cyrus/lmtpunix[307]: DBERROR: opening /var/lib/cyrus/deliver.db: Cannot allocate memory
Oct 7 07:45:56 post cyrus/lmtpunix[307]: DBERROR: opening /var/lib/cyrus/deliver.db: cyrusdb error
Oct 7 07:45:56 post cyrus/lmtpunix[307]: FATAL: lmtpd: unable to init duplicate delivery database
Oct 7 07:45:56 post cyrus/master[754]: service lmtpunix pid 307 in READY state: terminated abnormally
Seems like this can be fixed with:
/etc/init.d/cyrus2.2 stop
cat <<EOM > /var/lib/cyrus/db/DB_CONFIG
set_cachesize 0 2097152 1
set_lg_regionmax 1048576
EOM
/etc/init.d/cyrus2.2 start
Looking more closely into the logs, it turned out that this trouble started last night when I connected with a client running the soon to be released Ubuntu Oneiric Ocelot using the new kmail2.
So it looks like the KDE/Ubuntu folks broke kmail (or some KDE subsystem) again, as it also has trouble when migrating over from kmail(1), and it looks like it's not able to access most of the IMAP subfolders. Well done!
We are running the platform for the online streaming of Lohengrin live from the Bayreuth Festival Theatre on Sunday, 14th August 2011. Usually we come together around noon and have a BBQ while keeping all the stuff up and running.
To get back into this millennium, we (yes, my girl and I) are at the Highfield festival the weekend afterwards. I guess you can find me at the white stage or at the camp site. Keep on rocking!